Overview

Dataset statistics

Number of variables: 35
Number of observations: 490711
Missing cells: 25852
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 131.0 MiB
Average record size in memory: 280.0 B
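
The percentage and per-record figures above can be cross-checked from the raw counts. The sketch below is only an illustration of that arithmetic; the variable names are made up here, and it assumes "MiB" means 1024 × 1024 bytes and that the missing-cell percentage is taken over all cells (variables × observations).

```python
# Figures copied from the dataset statistics above.
n_variables = 35
n_observations = 490_711
missing_cells = 25_852
total_size_mib = 131.0

# Missing cells as a share of every cell in the table.
total_cells = n_variables * n_observations
missing_cells_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_size_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_cells_pct:.1f}%")
print(f"Average record size: {avg_record_size_bytes:.1f} B")
```

Because the total size is itself rounded to one decimal, the recomputed record size lands close to, but not exactly on, the reported 280.0 B.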

Variable types

Numeric: 6
Text: 7
DateTime: 1
Categorical: 21

Alerts

action is highly imbalanced (66.7%) [Imbalance]
adventure is highly imbalanced (79.5%) [Imbalance]
animation is highly imbalanced (86.5%) [Imbalance]
crime is highly imbalanced (72.5%) [Imbalance]
family is highly imbalanced (80.1%) [Imbalance]
fantasy is highly imbalanced (84.2%) [Imbalance]
history is highly imbalanced (84.8%) [Imbalance]
horror is highly imbalanced (68.7%) [Imbalance]
music is highly imbalanced (69.8%) [Imbalance]
mystery is highly imbalanced (83.5%) [Imbalance]
romance is highly imbalanced (61.3%) [Imbalance]
science_fiction is highly imbalanced (84.5%) [Imbalance]
tv_movie is highly imbalanced (77.6%) [Imbalance]
thriller is highly imbalanced (66.6%) [Imbalance]
war is highly imbalanced (89.3%) [Imbalance]
western is highly imbalanced (90.3%) [Imbalance]
budget_category is highly imbalanced (84.1%) [Imbalance]
release_date has 25852 (5.3%) missing values [Missing]
vote_count is highly skewed (γ1 = 26.99360862) [Skewed]
revenue is highly skewed (γ1 = 41.80162294) [Skewed]
runtime is highly skewed (γ1 = 79.05066591) [Skewed]
budget is highly skewed (γ1 = 26.30608203) [Skewed]
vote_average has 252311 (51.4%) zeros [Zeros]
vote_count has 252216 (51.4%) zeros [Zeros]
revenue has 457493 (93.2%) zeros [Zeros]
budget has 457493 (93.2%) zeros [Zeros]
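
The imbalance percentages in the alerts appear consistent with one minus the normalized Shannon entropy of the class counts. The sketch below is an illustration under that assumption, not a copy of the profiler's code; `imbalance_score` is a name chosen here, and the class counts are the ones reported for `action` in its variable section.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / log2(number of classes))."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0
    # Entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Class counts reported for `action`: 460570 zeros and 30141 ones.
action_score = imbalance_score([460570, 30141])
print(f"action imbalance: {action_score:.1%}")
```

A perfectly balanced binary column scores 0 under this definition, and the score approaches 1 as one class dominates.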

Reproduction

Analysis started: 2024-07-13 23:32:47.636352
Analysis finished: 2024-07-13 23:33:24.329848
Duration: 36.69 seconds
Software version: ydata-profiling v4.8.3
Download configuration: config.json

Variables

id
Real number (ℝ)

Distinct: 490668
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 575442.92
Minimum: 2
Maximum: 1307734
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 MiB

Quantile statistics

Minimum: 2
5-th percentile: 40724.5
Q1: 269349.5
Median: 543208
Q3: 874500
95-th percentile: 1209726.5
Maximum: 1307734
Range: 1307732
Interquartile range (IQR): 605150.5

Descriptive statistics

Standard deviation: 370153.49
Coefficient of variation (CV): 0.64324971
Kurtosis: -1.0620735
Mean: 575442.92
Median Absolute Deviation (MAD): 295662
Skewness: 0.23273366
Sum: 2.8237617 × 10^11
Variance: 1.3701361 × 10^11
Monotonicity: Not monotonic
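
Several of the statistics above are derived from others, which gives a quick consistency check. The sketch below recomputes them from the reported (rounded) values for `id`, so small rounding differences are expected; the variable names are illustrative only.

```python
# Reported values for `id`, copied from the tables above.
minimum, maximum = 2, 1307734
q1, q3 = 269349.5, 874500
mean, std = 575442.92, 370153.49

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = squared standard deviation

print(value_range, iqr, round(cv, 6), f"{variance:.4e}")
```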
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1244145 | 2 | < 0.1%
1271173 | 2 | < 0.1%
1205751 | 2 | < 0.1%
1192955 | 2 | < 0.1%
1282701 | 2 | < 0.1%
1218010 | 2 | < 0.1%
1258680 | 2 | < 0.1%
1202079 | 2 | < 0.1%
1207989 | 2 | < 0.1%
1219760 | 2 | < 0.1%
Other values (490658) | 490691 | > 99.9%

Minimum 10 values

Value | Count | Frequency (%)
2 | 1 | < 0.1%
3 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
8 | 1 | < 0.1%
11 | 1 | < 0.1%
12 | 1 | < 0.1%
13 | 1 | < 0.1%
14 | 1 | < 0.1%
15 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1307734 | 1 | < 0.1%
1307731 | 1 | < 0.1%
1307728 | 1 | < 0.1%
1307694 | 1 | < 0.1%
1307684 | 1 | < 0.1%
1307682 | 1 | < 0.1%
1307678 | 1 | < 0.1%
1307677 | 1 | < 0.1%
1307675 | 1 | < 0.1%
1307668 | 1 | < 0.1%

title
Text

Distinct: 439930
Distinct (%): 89.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 324
Median length: 245
Mean length: 20.871081
Min length: 1

Characters and Unicode

Total characters: 10241669
Distinct characters: 3407
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 415228
Unique (%): 84.6%

Sample

1st row: Inception
2nd row: Interstellar
3rd row: The Dark Knight
4th row: Avatar
5th row: The Avengers
Value | Count | Frequency (%)
the | 125897 | 7.0%
of | 50838 | 2.8%
a | 26794 | 1.5%
(blank) | 24367 | 1.4%
in | 20610 | 1.1%
and | 17630 | 1.0%
to | 12398 | 0.7%
2 | 11372 | 0.6%
my | 9696 | 0.5%
love | 8243 | 0.5%
Other values (171642) | 1495912 | 82.9%
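
The word table for `title` lists lowercase tokens, which suggests the text is lowercased and split into words before counting. The sketch below shows one way to produce such counts; it assumes simple whitespace splitting, which may differ from the profiler's actual tokenization, and `word_counts` is a name chosen here.

```python
from collections import Counter


def word_counts(texts):
    """Count lowercase, whitespace-separated tokens across all texts."""
    counter = Counter()
    for text in texts:
        counter.update(text.lower().split())
    return counter


# The five sample titles listed for this variable.
sample_titles = ["Inception", "Interstellar", "The Dark Knight", "Avatar", "The Avengers"]
counts = word_counts(sample_titles)
print(counts.most_common(3))
```

Even on the five sample rows, "the" is already the most frequent token, in line with the full table.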

Most occurring characters

Value | Count | Frequency (%)
(space) | 1313407 | 12.8%
e | 941039 | 9.2%
a | 673784 | 6.6%
o | 576617 | 5.6%
i | 553349 | 5.4%
n | 529504 | 5.2%
r | 515679 | 5.0%
t | 468177 | 4.6%
s | 419691 | 4.1%
l | 334519 | 3.3%
Other values (3397) | 3915903 | 38.2%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 10241669 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 10241669 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 10241669 | 100.0%

Most frequent character per category, per script, and per block: each of these tables repeats the "Most occurring characters" table, since every character falls into the single (unknown) group.

vote_average
Real number (ℝ)

ZEROS 

Distinct: 4972
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.9195443
Minimum: 0
Maximum: 10
Zeros: 252311
Zeros (%): 51.4%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 6
95-th percentile: 8
Maximum: 10
Range: 10
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.2625736
Coefficient of variation (CV): 1.1174941
Kurtosis: -1.3540654
Mean: 2.9195443
Median Absolute Deviation (MAD): 0
Skewness: 0.46002371
Sum: 1432652.5
Variance: 10.644387
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 252311 | 51.4%
6 | 18170 | 3.7%
5 | 17355 | 3.5%
7 | 14516 | 3.0%
8 | 11191 | 2.3%
10 | 11078 | 2.3%
4 | 6842 | 1.4%
5.5 | 6041 | 1.2%
6.5 | 5922 | 1.2%
2 | 5802 | 1.2%
Other values (4962) | 141483 | 28.8%

Minimum 10 values

Value | Count | Frequency (%)
0 | 252311 | 51.4%
0.5 | 237 | < 0.1%
0.75 | 1 | < 0.1%
0.8 | 88 | < 0.1%
0.875 | 1 | < 0.1%
0.9 | 1 | < 0.1%
1 | 3775 | 0.8%
1.1 | 5 | < 0.1%
1.167 | 2 | < 0.1%
1.179 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
10 | 11078 | 2.3%
9.98 | 1 | < 0.1%
9.9 | 8 | < 0.1%
9.872 | 1 | < 0.1%
9.833 | 3 | < 0.1%
9.8 | 74 | < 0.1%
9.769 | 1 | < 0.1%
9.75 | 11 | < 0.1%
9.7 | 78 | < 0.1%
9.684 | 1 | < 0.1%

vote_count
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 3591
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 42.514423
Minimum: 0
Maximum: 34495
Zeros: 252216
Zeros (%): 51.4%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 3
95-th percentile: 57
Maximum: 34495
Range: 34495
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 483.46681
Coefficient of variation (CV): 11.37183
Kurtosis: 995.83367
Mean: 42.514423
Median Absolute Deviation (MAD): 0
Skewness: 26.993609
Sum: 20862295
Variance: 233740.16
Monotonicity: Decreasing
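
The skewness figure (γ1) measures how lopsided a distribution is. The sketch below computes the population form, the third central moment divided by the cubed standard deviation, on a made-up toy sample; the profiler's own estimator may apply a sample-size correction, which would be negligible at this dataset's size. The function name and toy data are illustrative assumptions.

```python
def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data has no skew
    return m3 / m2 ** 1.5


# Hypothetical toy sample shaped like vote_count: mostly zeros, one larger value.
toy_counts = [0, 0, 0, 4]
toy_skew = skewness(toy_counts)
print(round(toy_skew, 4))
```

A long right tail, as in `vote_count` where over half the values are zero and the maximum is 34495, pushes this statistic far above zero.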
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 252216 | 51.4%
1 | 64787 | 13.2%
2 | 31132 | 6.3%
3 | 19990 | 4.1%
4 | 15294 | 3.1%
5 | 11441 | 2.3%
6 | 8764 | 1.8%
7 | 6899 | 1.4%
8 | 5490 | 1.1%
9 | 4650 | 0.9%
Other values (3581) | 70048 | 14.3%

Minimum 10 values

Value | Count | Frequency (%)
0 | 252216 | 51.4%
1 | 64787 | 13.2%
2 | 31132 | 6.3%
3 | 19990 | 4.1%
4 | 15294 | 3.1%
5 | 11441 | 2.3%
6 | 8764 | 1.8%
7 | 6899 | 1.4%
8 | 5490 | 1.1%
9 | 4650 | 0.9%

Maximum 10 values

Value | Count | Frequency (%)
34495 | 1 | < 0.1%
32571 | 1 | < 0.1%
30619 | 1 | < 0.1%
29815 | 1 | < 0.1%
29166 | 1 | < 0.1%
28894 | 1 | < 0.1%
27713 | 1 | < 0.1%
27238 | 1 | < 0.1%
26638 | 1 | < 0.1%
25893 | 1 | < 0.1%

release_date
Date

MISSING 

Distinct: 36380
Distinct (%): 7.8%
Missing: 25852
Missing (%): 5.3%
Memory size: 3.7 MiB
Minimum: 1800-09-11 00:00:00
Maximum: 2032-04-11 00:00:00
[Figure: histogram with fixed size bins (bins=50)]

revenue
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 13521
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1500022.2
Minimum: 0
Maximum: 3 × 10^9
Zeros: 457493
Zeros (%): 93.2%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 729497.77
Maximum: 3 × 10^9
Range: 3 × 10^9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 25267302
Coefficient of variation (CV): 16.844618
Kurtosis: 2834.5054
Mean: 1500022.2
Median Absolute Deviation (MAD): 0
Skewness: 41.801623
Sum: 7.360774 × 10^11
Variance: 6.3843653 × 10^14
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 457493 | 93.2%
729497.7705 | 16670 | 3.4%
100000 | 99 | < 0.1%
1000000 | 87 | < 0.1%
10000 | 82 | < 0.1%
2000000 | 75 | < 0.1%
1 | 66 | < 0.1%
10000000 | 62 | < 0.1%
500000 | 57 | < 0.1%
100 | 54 | < 0.1%
Other values (13511) | 15966 | 3.3%

Minimum 10 values

Value | Count | Frequency (%)
0 | 457493 | 93.2%
1 | 66 | < 0.1%
2 | 27 | < 0.1%
3 | 21 | < 0.1%
4 | 13 | < 0.1%
5 | 26 | < 0.1%
6 | 11 | < 0.1%
7 | 8 | < 0.1%
8 | 14 | < 0.1%
9 | 3 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
3000000000 | 1 | < 0.1%
2930000000 | 1 | < 0.1%
2923706026 | 1 | < 0.1%
2800000000 | 1 | < 0.1%
2320250281 | 1 | < 0.1%
2264162353 | 1 | < 0.1%
2068223624 | 1 | < 0.1%
2052415039 | 1 | < 0.1%
1921847111 | 1 | < 0.1%
1671537444 | 1 | < 0.1%

runtime
Real number (ℝ)

SKEWED 

Distinct: 701
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 97.273458
Minimum: 40
Maximum: 14400
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 MiB

Quantile statistics

Minimum: 40
5-th percentile: 50
Q1: 75
Median: 90
Q3: 108
95-th percentile: 165
Maximum: 14400
Range: 14360
Interquartile range (IQR): 33

Descriptive statistics

Standard deviation: 61.533162
Coefficient of variation (CV): 0.63257916
Kurtosis: 15253.737
Mean: 97.273458
Median Absolute Deviation (MAD): 16
Skewness: 79.050666
Sum: 47733156
Variance: 3786.33
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
90 | 28328 | 5.8%
60 | 12637 | 2.6%
85 | 11649 | 2.4%
100 | 11515 | 2.3%
80 | 11203 | 2.3%
95 | 10692 | 2.2%
88 | 9011 | 1.8%
120 | 8992 | 1.8%
92 | 8391 | 1.7%
93 | 8382 | 1.7%
Other values (691) | 369911 | 75.4%

Minimum 10 values

Value | Count | Frequency (%)
40 | 3030 | 0.6%
41 | 1216 | 0.2%
42 | 1801 | 0.4%
43 | 1806 | 0.4%
44 | 2357 | 0.5%
45 | 4279 | 0.9%
46 | 2011 | 0.4%
47 | 2162 | 0.4%
48 | 2391 | 0.5%
49 | 1751 | 0.4%

Maximum 10 values

Value | Count | Frequency (%)
14400 | 1 | < 0.1%
13319 | 1 | < 0.1%
12480 | 1 | < 0.1%
9000 | 1 | < 0.1%
7200 | 1 | < 0.1%
5700 | 1 | < 0.1%
5220 | 1 | < 0.1%
4320 | 1 | < 0.1%
3720 | 1 | < 0.1%
2880 | 1 | < 0.1%

budget
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 3780
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 572161.49
Minimum: 0
Maximum: 8.88 × 10^8
Zeros: 457493
Zeros (%): 93.2%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 90000
Maximum: 8.88 × 10^8
Range: 8.88 × 10^8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 6816937.3
Coefficient of variation (CV): 11.914359
Kurtosis: 1307.8637
Mean: 572161.49
Median Absolute Deviation (MAD): 0
Skewness: 26.306082
Sum: 2.8076594 × 10^11
Variance: 4.6470634 × 10^13
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 457493 | 93.2%
278702.5178 | 5795 | 1.2%
1000000 | 748 | 0.2%
10000 | 626 | 0.1%
500000 | 573 | 0.1%
2000000 | 563 | 0.1%
100000 | 541 | 0.1%
5000000 | 501 | 0.1%
10000000 | 492 | 0.1%
3000000 | 472 | 0.1%
Other values (3770) | 22907 | 4.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 457493 | 93.2%
1 | 148 | < 0.1%
2 | 69 | < 0.1%
3 | 50 | < 0.1%
4 | 21 | < 0.1%
5 | 76 | < 0.1%
6 | 22 | < 0.1%
7 | 27 | < 0.1%
8 | 15 | < 0.1%
9 | 9 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
888000000 | 1 | < 0.1%
600000000 | 1 | < 0.1%
460000000 | 1 | < 0.1%
417549000 | 1 | < 0.1%
379000000 | 1 | < 0.1%
365000000 | 1 | < 0.1%
356000000 | 1 | < 0.1%
340000000 | 1 | < 0.1%
300000000 | 6 | < 0.1%
294700000 | 1 | < 0.1%

(variable name missing from the extracted report)
Text

Distinct: 462072
Distinct (%): 94.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 1000
Median length: 789
Mean length: 297.18803
Min length: 1

Characters and Unicode

Total characters: 145833435
Distinct characters: 2793
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 459457
Unique (%): 93.6%

Sample

1st row: Cobb, a skilled thief who commits corporate espionage by infiltrating the subconscious of his targets is offered a chance to regain his old life as payment for a task considered to be impossible: "inception", the implantation of another person's idea into a target's subconscious.
2nd row: The adventures of a group of explorers who make use of a newly discovered wormhole to surpass the limitations on human space travel and conquer the vast distances involved in an interstellar voyage.
3rd row: Batman raises the stakes in his war on crime. With the help of Lt. Jim Gordon and District Attorney Harvey Dent, Batman sets out to dismantle the remaining criminal organizations that plague the streets. The partnership proves to be effective, but they soon find themselves prey to a reign of chaos unleashed by a rising criminal mastermind known to the terrified citizens of Gotham as the Joker.
4th row: In the 22nd century, a paraplegic Marine is dispatched to the moon Pandora on a unique mission, but becomes torn between following orders and protecting an alien civilization.
5th row: When an unexpected enemy emerges and threatens global safety and security, Nick Fury, director of the international peacekeeping agency known as S.H.I.E.L.D., finds himself in need of a team to pull the world back from the brink of disaster. Spanning the globe, a daring recruitment effort begins!
Value | Count | Frequency (%)
the | 1413466 | 5.7%
a | 853469 | 3.4%
and | 800344 | 3.2%
of | 744779 | 3.0%
to | 659884 | 2.7%
in | 498214 | 2.0%
is | 326761 | 1.3%
his | 274427 | 1.1%
with | 240876 | 1.0%
her | 216840 | 0.9%
Other values (451919) | 18817153 | 75.7%

Most occurring characters

Value | Count | Frequency (%)
(space) | 24439169 | 16.8%
e | 13726519 | 9.4%
a | 9468676 | 6.5%
t | 9453732 | 6.5%
i | 8499596 | 5.8%
o | 8448621 | 5.8%
n | 8232510 | 5.6%
s | 7714690 | 5.3%
r | 7367693 | 5.1%
h | 5982588 | 4.1%
Other values (2783) | 42499641 | 29.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 145833435 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 145833435 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 145833435 | 100.0%

Most frequent character per category, per script, and per block: each of these tables repeats the "Most occurring characters" table, since every character falls into the single (unknown) group.

genres
Text

Distinct: 10157
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 99
Median length: 92
Mean length: 9.9566017
Min length: 3

Characters and Unicode

Total characters: 4885814
Distinct characters: 29
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5884
Unique (%): 1.2%

Sample

1st row: Action, Science Fiction, Adventure
2nd row: Adventure, Drama, Science Fiction
3rd row: Drama, Action, Crime, Thriller
4th row: Action, Adventure, Fantasy, Science Fiction
5th row: Science Fiction, Action, Adventure
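
The categorical 0/1 genre variables in this report (action, adventure, science_fiction, and so on) look like indicator columns derived from this comma-separated field. The sketch below shows one plausible derivation; the mapping rule (lowercase, spaces replaced by underscores) and the function name are assumptions made here, not something the report states.

```python
def genre_indicators(genres, known_genres):
    """Map a comma-separated genre string to {column_name: 0 or 1}."""
    present = {
        g.strip().lower().replace(" ", "_")
        for g in genres.split(",")
        if g.strip()
    }
    return {name: int(name in present) for name in known_genres}


# Column names taken from the alert list; the mapping rule is an assumption.
columns = ["action", "adventure", "animation", "science_fiction"]
first_row = genre_indicators("Action, Science Fiction, Adventure", columns)
print(first_row)
```

For the first sample row this yields 1 for action and adventure and 0 for animation, which agrees with the first sample values shown in those variables' sections.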
Value | Count | Frequency (%)
nan | 142442 | 19.1%
drama | 131328 | 17.6%
documentary | 79544 | 10.6%
comedy | 74881 | 10.0%
romance | 37120 | 5.0%
thriller | 30219 | 4.0%
action | 30145 | 4.0%
horror | 27732 | 3.7%
music | 26330 | 3.5%
crime | 23289 | 3.1%
Other values (12) | 144608 | 19.3%

Most occurring characters

Value | Count | Frequency (%)
a | 575688 | 11.8%
n | 505338 | 10.3%
r | 429287 | 8.8%
m | 370626 | 7.6%
e | 340402 | 7.0%
o | 325881 | 6.7%
(space) | 256927 | 5.3%
, | 228234 | 4.7%
y | 215491 | 4.4%
D | 210872 | 4.3%
Other values (19) | 1427068 | 29.2%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 4885814 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 4885814 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 4885814 | 100.0%

Most frequent character per category, per script, and per block: each of these tables repeats the "Most occurring characters" table, since every character falls into the single (unknown) group.

(variable name missing from the extracted report)
Text

Distinct: 134468
Distinct (%): 27.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 586
Median length: 469
Mean length: 17.455561
Min length: 0

Characters and Unicode

Total characters: 8565636
Distinct characters: 1305
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 112108
Unique (%): 22.8%

Sample

1st row: Legendary Pictures, Syncopy, Warner Bros. Pictures
2nd row: Legendary Pictures, Syncopy, Lynda Obst Productions
3rd row: DC Comics, Legendary Pictures, Syncopy, Isobel Griffiths, Warner Bros. Pictures
4th row: Dune Entertainment, Lightstorm Entertainment, 20th Century Fox, Ingenious Media
5th row: Marvel Studios
Value | Count | Frequency (%)
nan | 186756 | 14.7%
films | 50523 | 4.0%
productions | 49225 | 3.9%
film | 39053 | 3.1%
pictures | 34962 | 2.8%
entertainment | 26694 | 2.1%
studios | 10528 | 0.8%
media | 10292 | 0.8%
company | 8624 | 0.7%
production | 8225 | 0.6%
Other values (69854) | 846212 | 66.6%

Most occurring characters

Value | Count | Frequency (%)
n | 858354 | 10.0%
(space) | 780636 | 9.1%
a | 644217 | 7.5%
i | 632904 | 7.4%
e | 541192 | 6.3%
o | 495022 | 5.8%
t | 444925 | 5.2%
r | 443023 | 5.2%
s | 354579 | 4.1%
l | 310916 | 3.6%
Other values (1295) | 3059868 | 35.7%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 8565636 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 8565636 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 8565636 | 100.0%

Most frequent character per category, per script, and per block: each of these tables repeats the "Most occurring characters" table, since every character falls into the single (unknown) group.

(variable name missing from the extracted report)
Text

Distinct: 8213
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 596
Median length: 333
Mean length: 11.258824
Min length: 3

Characters and Unicode

Total characters: 5524829
Distinct characters: 57
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5828
Unique (%): 1.2%

Sample

1st row: United Kingdom, United States of America
2nd row: United Kingdom, United States of America
3rd row: United Kingdom, United States of America
4th row: United States of America, United Kingdom
5th row: United States of America
Value | Count | Frequency (%)
united | 148508 | 15.5%
nan | 144820 | 15.1%
states | 124222 | 13.0%
of | 124216 | 13.0%
america | 124216 | 13.0%
japan | 30322 | 3.2%
kingdom | 24098 | 2.5%
france | 21858 | 2.3%
germany | 19658 | 2.1%
india | 14877 | 1.6%
Other values (281) | 179617 | 18.8%

Most occurring characters

Value | Count | Frequency (%)
a | 680016 | 12.3%
n | 649595 | 11.8%
e | 524311 | 9.5%
(space) | 465701 | 8.4%
t | 447684 | 8.1%
i | 414290 | 7.5%
d | 224914 | 4.1%
r | 223170 | 4.0%
o | 204884 | 3.7%
m | 177242 | 3.2%
Other values (47) | 1513022 | 27.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 5524829 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 5524829 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 5524829 | 100.0%

Most frequent character per category, per script, and per block: each of these tables repeats the "Most occurring characters" table, since every character falls into the single (unknown) group.

(variable name missing from the extracted report)
Text

Distinct: 5976
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 172
Median length: 145
Mean length: 6.9363699
Min length: 3

Characters and Unicode

Total characters: 3403753
Distinct characters: 56
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4073
Unique (%): 0.8%

Sample

1st row: English, French, Japanese, Swahili
2nd row: English
3rd row: English, Mandarin
4th row: English, Spanish
5th row: English, Hindi, Russian
Value | Count | Frequency (%)
english | 176692 | 32.3%
nan | 126597 | 23.1%
japanese | 32132 | 5.9%
french | 26216 | 4.8%
spanish | 23755 | 4.3%
german | 19867 | 3.6%
italian | 12962 | 2.4%
russian | 11453 | 2.1%
mandarin | 10786 | 2.0%
portuguese | 7597 | 1.4%
Other values (176) | 98990 | 18.1%

Most occurring characters

Value | Count | Frequency (%)
n | 640952 | 18.8%
a | 386286 | 11.3%
i | 302738 | 8.9%
s | 287578 | 8.4%
h | 253539 | 7.4%
l | 216124 | 6.3%
g | 209080 | 6.1%
E | 177240 | 5.2%
e | 173801 | 5.1%
r | 98830 | 2.9%
Other values (46) | 657585 | 19.3%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 3403753 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 3403753 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 3403753 | 100.0%

Most frequent character per category, per script, and per block: each of these tables repeats the "Most occurring characters" table, since every character falls into the single (unknown) group.

(variable name missing from the extracted report)
Text

Distinct: 154
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 34
Median length: 7
Mean length: 7.1373456
Min length: 3

Characters and Unicode

Total characters: 3502374
Distinct characters: 63
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): < 0.1%

Sample

1st row: English
2nd row: English
3rd row: English
4th row: English
5th row: English
Value | Count | Frequency (%)
english | 283382 | 56.7%
japanese | 33232 | 6.6%
french | 22787 | 4.6%
spanish | 20750 | 4.1%
german | 16988 | 3.4%
chinese | 10611 | 2.1%
italian | 10475 | 2.1%
russian | 9292 | 1.9%
portuguese | 6873 | 1.4%
korean | 6834 | 1.4%
Other values (174) | 78932 | 15.8%

Most occurring characters

Value | Count | Frequency (%)
n | 463275 | 13.2%
s | 396503 | 11.3%
i | 393590 | 11.2%
h | 361431 | 10.3%
l | 319748 | 9.1%
g | 307386 | 8.8%
E | 284315 | 8.1%
a | 213782 | 6.1%
e | 182417 | 5.2%
r | 81107 | 2.3%
Other values (53) | 498820 | 14.2%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 3502374 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 3502374 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 3502374 | 100.0%

Most frequent character per category, per script, and per block: each of these tables repeats the "Most occurring characters" table, since every character falls into the single (unknown) group.

action
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
0 | 460570 | 93.9%
1 | 30141 | 6.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Most occurring characters

Value | Count | Frequency (%)
0 | 460570 | 93.9%
1 | 30141 | 6.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category, per script, and per block: each of these tables repeats the "Most occurring characters" table, since every character falls into the single (unknown) group.

adventure
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
0 | 474938 | 96.8%
1 | 15773 | 3.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Most occurring characters

Value | Count | Frequency (%)
0 | 474938 | 96.8%
1 | 15773 | 3.2%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category, per script, and per block: each of these tables repeats the "Most occurring characters" table, since every character falls into the single (unknown) group.

animation
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 481448 | 98.1%
1 | 9263 | 1.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Most occurring characters

ValueCountFrequency (%)
0 481448
98.1%
1 9263
 
1.9%

Most occurring categories

ValueCountFrequency (%)
(unknown) 490711
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 481448
98.1%
1 9263
 
1.9%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 490711
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 481448
98.1%
1 9263
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 490711
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 481448
98.1%
1 9263
 
1.9%
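
The imbalance percentage behind the IMBALANCE flag can be recomputed from the two animation counts. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalized by the logarithm of the number of classes. Under that assumption the animation counts give roughly 86.5%, which agrees with the figure listed for animation in the Alerts section. The function name `imbalance_score` is illustrative and not taken from the report.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / log2(number of classes)) for class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0
    # Entropy in bits; classes with zero count contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

animation_counts = [481448, 9263]  # counts for values 0 and 1
score = imbalance_score(animation_counts)
print(f"animation imbalance: {score:.1%}")
```

A perfectly balanced column scores 0 and a column with a single dominant value approaches 1, so the same function can be applied to the other flag columns in this section.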

comedy
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0 | 415842
1 | 74869

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 415842 | 84.7%
1 | 74869 | 15.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 415842 | 84.7%
1 | 74869 | 15.3%

Most occurring characters

Value | Count | Frequency (%)
0 | 415842 | 84.7%
1 | 74869 | 15.3%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 415842 | 84.7%
1 | 74869 | 15.3%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 415842 | 84.7%
1 | 74869 | 15.3%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 415842 | 84.7%
1 | 74869 | 15.3%

crime
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0 | 467424
1 | 23287

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 467424 | 95.3%
1 | 23287 | 4.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 467424 | 95.3%
1 | 23287 | 4.7%

Most occurring characters

Value | Count | Frequency (%)
0 | 467424 | 95.3%
1 | 23287 | 4.7%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 467424 | 95.3%
1 | 23287 | 4.7%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 467424 | 95.3%
1 | 23287 | 4.7%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 467424 | 95.3%
1 | 23287 | 4.7%

documentary
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0 | 411189
1 | 79522

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 411189 | 83.8%
1 | 79522 | 16.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 411189 | 83.8%
1 | 79522 | 16.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 411189 | 83.8%
1 | 79522 | 16.2%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 411189 | 83.8%
1 | 79522 | 16.2%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 411189 | 83.8%
1 | 79522 | 16.2%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 411189 | 83.8%
1 | 79522 | 16.2%

drama
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0 | 359414
1 | 131297

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 359414 | 73.2%
1 | 131297 | 26.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 359414 | 73.2%
1 | 131297 | 26.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 359414 | 73.2%
1 | 131297 | 26.8%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 359414 | 73.2%
1 | 131297 | 26.8%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 359414 | 73.2%
1 | 131297 | 26.8%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 359414 | 73.2%
1 | 131297 | 26.8%

family
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0 | 475517
1 | 15194

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 475517 | 96.9%
1 | 15194 | 3.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 475517 | 96.9%
1 | 15194 | 3.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 475517 | 96.9%
1 | 15194 | 3.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 475517 | 96.9%
1 | 15194 | 3.1%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 475517 | 96.9%
1 | 15194 | 3.1%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 475517 | 96.9%
1 | 15194 | 3.1%

fantasy
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0 | 479445
1 | 11266

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 479445 | 97.7%
1 | 11266 | 2.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 479445 | 97.7%
1 | 11266 | 2.3%

Most occurring characters

Value | Count | Frequency (%)
0 | 479445 | 97.7%
1 | 11266 | 2.3%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 479445 | 97.7%
1 | 11266 | 2.3%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 479445 | 97.7%
1 | 11266 | 2.3%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 479445 | 97.7%
1 | 11266 | 2.3%

history
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0 | 479943
1 | 10768

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 479943 | 97.8%
1 | 10768 | 2.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 479943 | 97.8%
1 | 10768 | 2.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 479943 | 97.8%
1 | 10768 | 2.2%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 479943 | 97.8%
1 | 10768 | 2.2%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 479943 | 97.8%
1 | 10768 | 2.2%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 479943 | 97.8%
1 | 10768 | 2.2%

horror
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0 | 462983
1 | 27728

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 462983 | 94.3%
1 | 27728 | 5.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 462983 | 94.3%
1 | 27728 | 5.7%

Most occurring characters

Value | Count | Frequency (%)
0 | 462983 | 94.3%
1 | 27728 | 5.7%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 462983 | 94.3%
1 | 27728 | 5.7%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 462983 | 94.3%
1 | 27728 | 5.7%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 462983 | 94.3%
1 | 27728 | 5.7%

music
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0 | 464394
1 | 26317

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 464394 | 94.6%
1 | 26317 | 5.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 464394 | 94.6%
1 | 26317 | 5.4%

Most occurring characters

Value | Count | Frequency (%)
0 | 464394 | 94.6%
1 | 26317 | 5.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 464394 | 94.6%
1 | 26317 | 5.4%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 464394 | 94.6%
1 | 26317 | 5.4%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 464394 | 94.6%
1 | 26317 | 5.4%

mystery
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0 | 478795
1 | 11916

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 478795 | 97.6%
1 | 11916 | 2.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 478795 | 97.6%
1 | 11916 | 2.4%

Most occurring characters

Value | Count | Frequency (%)
0 | 478795 | 97.6%
1 | 11916 | 2.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 478795 | 97.6%
1 | 11916 | 2.4%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 478795 | 97.6%
1 | 11916 | 2.4%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 478795 | 97.6%
1 | 11916 | 2.4%

romance
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0 | 453595
1 | 37116

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 453595 | 92.4%
1 | 37116 | 7.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 453595 | 92.4%
1 | 37116 | 7.6%

Most occurring characters

Value | Count | Frequency (%)
0 | 453595 | 92.4%
1 | 37116 | 7.6%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 453595 | 92.4%
1 | 37116 | 7.6%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 453595 | 92.4%
1 | 37116 | 7.6%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 453595 | 92.4%
1 | 37116 | 7.6%

science_fiction
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0 | 479725
1 | 10986

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
0 | 479725 | 97.8%
1 | 10986 | 2.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 479725 | 97.8%
1 | 10986 | 2.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 479725 | 97.8%
1 | 10986 | 2.2%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 479725 | 97.8%
1 | 10986 | 2.2%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 479725 | 97.8%
1 | 10986 | 2.2%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 479725 | 97.8%
1 | 10986 | 2.2%

tv_movie
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0 | 473006
1 | 17705

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 473006 | 96.4%
1 | 17705 | 3.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 473006 | 96.4%
1 | 17705 | 3.6%

Most occurring characters

Value | Count | Frequency (%)
0 | 473006 | 96.4%
1 | 17705 | 3.6%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 473006 | 96.4%
1 | 17705 | 3.6%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 473006 | 96.4%
1 | 17705 | 3.6%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 473006 | 96.4%
1 | 17705 | 3.6%

thriller
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0 | 460494
1 | 30217

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 460494 | 93.8%
1 | 30217 | 6.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 460494 | 93.8%
1 | 30217 | 6.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 460494 | 93.8%
1 | 30217 | 6.2%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 460494 | 93.8%
1 | 30217 | 6.2%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 460494 | 93.8%
1 | 30217 | 6.2%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 460494 | 93.8%
1 | 30217 | 6.2%

war
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0 | 483785
1 | 6926

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 483785 | 98.6%
1 | 6926 | 1.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 483785 | 98.6%
1 | 6926 | 1.4%

Most occurring characters

Value | Count | Frequency (%)
0 | 483785 | 98.6%
1 | 6926 | 1.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 483785 | 98.6%
1 | 6926 | 1.4%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 483785 | 98.6%
1 | 6926 | 1.4%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 483785 | 98.6%
1 | 6926 | 1.4%

western
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
0 | 484609
1 | 6102

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 490711
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 484609 | 98.8%
1 | 6102 | 1.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 484609 | 98.8%
1 | 6102 | 1.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 484609 | 98.8%
1 | 6102 | 1.2%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
0 | 484609 | 98.8%
1 | 6102 | 1.2%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
0 | 484609 | 98.8%
1 | 6102 | 1.2%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 490711 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
0 | 484609 | 98.8%
1 | 6102 | 1.2%

runtime_category
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
Short Films | 323183
Mid-Length Films | 119501
Long Films | 29827
Super Long Films | 18200

Length

Max length: 16
Median length: 11
Mean length: 12.342293
Min length: 10

Characters and Unicode

Total characters: 6056499
Distinct characters: 20
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Long Films
2nd row: Long Films
3rd row: Long Films
4th row: Long Films
5th row: Long Films

Common Values

Value | Count | Frequency (%)
Short Films | 323183 | 65.9%
Mid-Length Films | 119501 | 24.4%
Long Films | 29827 | 6.1%
Super Long Films | 18200 | 3.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
films | 490711 | 49.1%
short | 323183 | 32.3%
mid-length | 119501 | 12.0%
long | 48027 | 4.8%
super | 18200 | 1.8%

Most occurring characters

Value | Count | Frequency (%)
i | 610212 | 10.1%
(space) | 508911 | 8.4%
s | 490711 | 8.1%
m | 490711 | 8.1%
F | 490711 | 8.1%
l | 490711 | 8.1%
h | 442684 | 7.3%
t | 442684 | 7.3%
o | 371210 | 6.1%
S | 341383 | 5.6%
Other values (10) | 1376571 | 22.7%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 6056499 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
i | 610212 | 10.1%
(space) | 508911 | 8.4%
s | 490711 | 8.1%
m | 490711 | 8.1%
F | 490711 | 8.1%
l | 490711 | 8.1%
h | 442684 | 7.3%
t | 442684 | 7.3%
o | 371210 | 6.1%
S | 341383 | 5.6%
Other values (10) | 1376571 | 22.7%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 6056499 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
i | 610212 | 10.1%
(space) | 508911 | 8.4%
s | 490711 | 8.1%
m | 490711 | 8.1%
F | 490711 | 8.1%
l | 490711 | 8.1%
h | 442684 | 7.3%
t | 442684 | 7.3%
o | 371210 | 6.1%
S | 341383 | 5.6%
Other values (10) | 1376571 | 22.7%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 6056499 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
i | 610212 | 10.1%
(space) | 508911 | 8.4%
s | 490711 | 8.1%
m | 490711 | 8.1%
F | 490711 | 8.1%
l | 490711 | 8.1%
h | 442684 | 7.3%
t | 442684 | 7.3%
o | 371210 | 6.1%
S | 341383 | 5.6%
Other values (10) | 1376571 | 22.7%
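
The length and character figures for runtime_category follow directly from the four value counts. A minimal sketch, using only the counts listed under Common Values, recomputes the total characters, the mean length, and the per-character counts; the variable names are illustrative.

```python
from collections import Counter

# Value counts as listed under Common Values for runtime_category.
value_counts = {
    "Short Films": 323183,
    "Mid-Length Films": 119501,
    "Long Films": 29827,
    "Super Long Films": 18200,
}

n_rows = sum(value_counts.values())
total_chars = sum(len(label) * n for label, n in value_counts.items())
mean_length = total_chars / n_rows

# Weight each character of a label by the number of rows carrying that label.
char_counts = Counter()
for label, n in value_counts.items():
    for ch in label:
        char_counts[ch] += n

print(n_rows, total_chars, round(mean_length, 6))
print(char_counts["i"], char_counts[" "], len(char_counts))
```

The unlabeled row with 508911 occurrences in the character tables corresponds to the space character, which is why it is written as (space) here.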

budget_category
Categorical

IMBALANCE

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

Value | Count
Micro-Budget | 468492
Low-Budget | 14447
Mid-Budget | 6394
High-Budget | 1378

Length

Max length: 12
Median length: 12
Mean length: 11.91225
Min length: 10

Characters and Unicode

Total characters: 5845472
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: High-Budget
2nd row: High-Budget
3rd row: High-Budget
4th row: High-Budget
5th row: High-Budget

Common Values

Value | Count | Frequency (%)
Micro-Budget | 468492 | 95.5%
Low-Budget | 14447 | 2.9%
Mid-Budget | 6394 | 1.3%
High-Budget | 1378 | 0.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
micro-budget | 468492 | 95.5%
low-budget | 14447 | 2.9%
mid-budget | 6394 | 1.3%
high-budget | 1378 | 0.3%

Most occurring characters

Value | Count | Frequency (%)
d | 497105 | 8.5%
g | 492089 | 8.4%
- | 490711 | 8.4%
B | 490711 | 8.4%
u | 490711 | 8.4%
e | 490711 | 8.4%
t | 490711 | 8.4%
o | 482939 | 8.3%
i | 476264 | 8.1%
M | 474886 | 8.1%
Other values (6) | 968634 | 16.6%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 5845472 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
d | 497105 | 8.5%
g | 492089 | 8.4%
- | 490711 | 8.4%
B | 490711 | 8.4%
u | 490711 | 8.4%
e | 490711 | 8.4%
t | 490711 | 8.4%
o | 482939 | 8.3%
i | 476264 | 8.1%
M | 474886 | 8.1%
Other values (6) | 968634 | 16.6%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 5845472 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
d | 497105 | 8.5%
g | 492089 | 8.4%
- | 490711 | 8.4%
B | 490711 | 8.4%
u | 490711 | 8.4%
e | 490711 | 8.4%
t | 490711 | 8.4%
o | 482939 | 8.3%
i | 476264 | 8.1%
M | 474886 | 8.1%
Other values (6) | 968634 | 16.6%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 5845472 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
d | 497105 | 8.5%
g | 492089 | 8.4%
- | 490711 | 8.4%
B | 490711 | 8.4%
u | 490711 | 8.4%
e | 490711 | 8.4%
t | 490711 | 8.4%
o | 482939 | 8.3%
i | 476264 | 8.1%
M | 474886 | 8.1%
Other values (6) | 968634 | 16.6%
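
The Frequency (%) column is each count divided by the number of observations. A short sketch with the budget_category counts, rounding to one decimal place as the report appears to do; the names are illustrative.

```python
# Value counts as listed under Common Values for budget_category.
budget_counts = {
    "Micro-Budget": 468492,
    "Low-Budget": 14447,
    "Mid-Budget": 6394,
    "High-Budget": 1378,
}

n_rows = sum(budget_counts.values())

# Share of rows per category, as a percentage rounded to one decimal place.
frequency_pct = {
    label: round(100 * count / n_rows, 1) for label, count in budget_counts.items()
}

for label, pct in frequency_pct.items():
    print(f"{label}: {pct}%")
```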

Interactions

[Figure: 36 pairwise interaction plots]

Missing values

[Figure: nullity by column]
A simple visualization of nullity by column.
[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
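
To make the two displays concrete, the sketch below builds a nullity matrix (one boolean per cell, True where a value is present) and the per-column missing counts for a tiny hand-made table. The rows are invented for illustration and are not taken from the dataset; only the column names come from the report.

```python
# Toy rows (hypothetical values); None marks a missing cell.
rows = [
    {"id": 1, "title": "A", "release_date": "2000-01-01"},
    {"id": 2, "title": "B", "release_date": None},
    {"id": 3, "title": "C", "release_date": "2000-01-03"},
    {"id": 4, "title": "D", "release_date": None},
]
columns = ["id", "title", "release_date"]

# Nullity matrix: True where a value is present, False where it is missing.
nullity_matrix = [[row[col] is not None for col in columns] for row in rows]

# Nullity by column: number of missing cells per column.
missing_by_column = {
    col: sum(1 for row in rows if row[col] is None) for col in columns
}

for col in columns:
    share = missing_by_column[col] / len(rows)
    print(f"{col}: {missing_by_column[col]} missing ({share:.1%})")

# Text rendering of the matrix: '#' for present, '.' for missing.
for line in nullity_matrix:
    print("".join("#" if present else "." for present in line))
```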

Sample

 | id | title | vote_average | vote_count | release_date | revenue | runtime | budget | synopsis | genres | production_companies | production_countries | spoken_languages | original_language | action | adventure | animation | comedy | crime | documentary | drama | family | fantasy | history | horror | music | mystery | romance | science_fiction | tv_movie | thriller | war | western | runtime_category | budget_category
0 | 27205 | Inception | 8.364 | 34495 | 2010-07-15 | 8.255328e+08 | 148 | 160000000.0 | Cobb, a skilled thief who commits corporate espionage by infiltrating the subconscious of his targets is offered a chance to regain his old life as payment for a task considered to be impossible: "inception", the implantation of another person's idea into a target's subconscious. | Action, Science Fiction, Adventure | Legendary Pictures, Syncopy, Warner Bros. Pictures | United Kingdom, United States of America | English, French, Japanese, Swahili | English | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | Long Films | High-Budget
1 | 157336 | Interstellar | 8.417 | 32571 | 2014-11-05 | 7.017292e+08 | 169 | 165000000.0 | The adventures of a group of explorers who make use of a newly discovered wormhole to surpass the limitations on human space travel and conquer the vast distances involved in an interstellar voyage. | Adventure, Drama, Science Fiction | Legendary Pictures, Syncopy, Lynda Obst Productions | United Kingdom, United States of America | English | English | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | Long Films | High-Budget
2 | 155 | The Dark Knight | 8.512 | 30619 | 2008-07-16 | 1.004558e+09 | 152 | 185000000.0 | Batman raises the stakes in his war on crime. With the help of Lt. Jim Gordon and District Attorney Harvey Dent, Batman sets out to dismantle the remaining criminal organizations that plague the streets. The partnership proves to be effective, but they soon find themselves prey to a reign of chaos unleashed by a rising criminal mastermind known to the terrified citizens of Gotham as the Joker. | Drama, Action, Crime, Thriller | DC Comics, Legendary Pictures, Syncopy, Isobel Griffiths, Warner Bros. Pictures | United Kingdom, United States of America | English, Mandarin | English | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | Long Films | High-Budget
3 | 19995 | Avatar | 7.573 | 29815 | 2009-12-15 | 2.923706e+09 | 162 | 237000000.0 | In the 22nd century, a paraplegic Marine is dispatched to the moon Pandora on a unique mission, but becomes torn between following orders and protecting an alien civilization. | Action, Adventure, Fantasy, Science Fiction | Dune Entertainment, Lightstorm Entertainment, 20th Century Fox, Ingenious Media | United States of America, United Kingdom | English, Spanish | English | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | Long Films | High-Budget
4 | 24428 | The Avengers | 7.710 | 29166 | 2012-04-25 | 1.518816e+09 | 143 | 220000000.0 | When an unexpected enemy emerges and threatens global safety and security, Nick Fury, director of the international peacekeeping agency known as S.H.I.E.L.D., finds himself in need of a team to pull the world back from the brink of disaster. Spanning the globe, a daring recruitment effort begins! | Science Fiction, Action, Adventure | Marvel Studios | United States of America | English, Hindi, Russian | English | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | Long Films | High-Budget
5 | 293660 | Deadpool | 7.606 | 28894 | 2016-02-09 | 7.831000e+08 | 108 | 58000000.0 | The origin story of former Special Forces operative turned mercenary Wade Wilson, who, after being subjected to a rogue experiment that leaves him with accelerated healing powers, adopts the alter ego Deadpool. Armed with his new abilities and a dark, twisted sense of humor, Deadpool hunts down the man who nearly destroyed his life. | Action, Adventure, Comedy | 20th Century Fox, The Donners' Company, Genre Films | United States of America | English | English | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Mid-Length Films | High-Budget
6 | 299536 | Avengers: Infinity War | 8.255 | 27713 | 2018-04-25 | 2.052415e+09 | 149 | 300000000.0 | As the Avengers and their allies have continued to protect the world from threats too large for any one hero to handle, a new danger has emerged from the cosmic shadows: Thanos. A despot of intergalactic infamy, his goal is to collect all six Infinity Stones, artifacts of unimaginable power, and use them to inflict his twisted will on all of reality. Everything the Avengers have fought for has led up to this moment - the fate of Earth and existence itself has never been more uncertain. | Adventure, Action, Science Fiction | Marvel Studios | United States of America | English, Xhosa | English | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | Long Films | High-Budget
7 | 550 | Fight Club | 8.438 | 27238 | 1999-10-15 | 1.008538e+08 | 139 | 63000000.0 | A ticking-time-bomb insomniac and a slippery soap salesman channel primal male aggression into a shocking new form of therapy. Their concept catches on, with underground "fight clubs" forming in every town, until an eccentric gets in the way and ignites an out-of-control spiral toward oblivion. | Drama | Regency Enterprises, Fox 2000 Pictures, Taurus Film, Atman Entertainment, Knickerbocker Films, The Linson Company, 20th Century Fox | United States of America | English | English | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Mid-Length Films | High-Budget
8 | 118340 | Guardians of the Galaxy | 7.906 | 26638 | 2014-07-30 | 7.727766e+08 | 121 | 170000000.0 | Light years from Earth, 26 years after being abducted, Peter Quill finds himself the prime target of a manhunt after discovering an orb wanted by Ronan the Accuser. | Action, Science Fiction, Adventure | Marvel Studios | United States of America | English | English | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | Mid-Length Films | High-Budget
9 | 680 | Pulp Fiction | 8.488 | 25893 | 1994-09-10 | 2.139000e+08 | 154 | 8500000.0 | A burger-loving hit man, his philosophical partner, a drug-addled gangster's moll and a washed-up boxer converge in this sprawling, comedic crime caper. Their adventures unfurl in three stories that ingeniously trip back and forth in time. | Thriller, Crime | Miramax, A Band Apart, Jersey Films | United States of America | English, Spanish, French | English | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | Long Films | Mid-Budget

 | id | title | vote_average | vote_count | release_date | revenue | runtime | budget | synopsis | genres | production_companies | production_countries | spoken_languages | original_language | action | adventure | animation | comedy | crime | documentary | drama | family | fantasy | history | horror | music | mystery | romance | science_fiction | tv_movie | thriller | war | western | runtime_category | budget_category
490701 | 743419 | I Am Chut Wutty | 0.0 | 0 | 2016-04-05 | 0.0 | 57 | 0.0 | In one of the last remaining wildernesses in South East Asia, a community of activists are struggling to defend their forest. | Documentary | nan | nan | nan | English | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Short Films | Micro-Budget
490702 | 743457 | Colmax University: Honor Sluts | 0.0 | 0 | 2012-12-06 | 0.0 | 89 | 0.0 | Nowadays students are more and more promiscuous. Honestly, guys : what kind of world do we live in?! For sure a perfect world of debauchery where girls could get a bitch sucking degree. Here it is! Enjoy this very hardcore Gonzo, featuring many french porn stars!! | nan | Colmax | nan | nan | French | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Short Films | Micro-Budget
490703 | 743462 | The Magic Blade | 0.0 | 0 | 2017-04-01 | 0.0 | 108 | 0.0 | 24 years ago, "God of Sabre" Yang Chang Feng was double-crossed and murdered by someone close to him. | Action, Adventure | nan | China | Mandarin | Chinese | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Mid-Length Films | Micro-Budget
490704 | 743469 | Chnchik | 0.0 | 0 | 2020-10-23 | 0.0 | 91 | 0.0 | An innocent girl in a village is bullied by her family and villagers for her poor behavior, and she is called Chnchik that means stupid instead of her real name. Her mistakes become a neighbor′s laughter, and her parents, who are not happy with this situation, are ashamed of her as well. However, she does not lose her dignity, and works alone to herd goats in the mountains far from the village. One day, she encounters an unfamiliar soldier who was isolated while he is training. They feel small compassion with each other and fall in love. However, a short, dreamlike moment of love on a midsummer day brings a permanent change to her life. | Drama | Armna Production | Armenia, Germany, Netherlands | Armenian | Armenian | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Short Films | Micro-Budget
490705 | 743470 | Nungshi Echel | 0.0 | 0 | NaT | 2000.0 | 90 | 1000.0 | R.C. Entertainment Present Tele Play Nungshi Echel Cast: Ratan Lai, Kajal, Thoithoi | nan | nan | nan | nan | English | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Short Films | Micro-Budget
490706 | 287149 | Das Attentat - Heydrich in Prag | 0.0 | 0 | 1967-09-13 | 0.0 | 90 | 0.0 | Based on historical events, the film tells the story of Operation Anthropoid which led to the assassination of the German SS leader Heydrich in Prague by Czech rebels led by Josef Gabcik and Jan Kubis. | War | Bavaria Film | Germany | nan | German | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | Short Films | Micro-Budget
490707 | 743426 | Women in an S & M Prison: Captive Flesh Demon 5 | 0.0 | 0 | 2015-04-24 | 0.0 | 121 | 0.0 | nan | Drama | K.M.Produce | Japan | Japanese | Japanese | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Mid-Length Films | Micro-Budget
490708 | 287152 | Percival's Big Night | 0.0 | 0 | 2012-03-04 | 0.0 | 86 | 0.0 | The idiot's guide to getting your life back on track when the only tools at your disposal are a half-assed BA in Fine Arts, a part time job as a delivery boy, some really dank weed, a bow tie, and the love of your life who has never noticed you. Until now. | Drama, Comedy | nan | nan | nan | English | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Short Films | Micro-Budget
490709 | 743442 | The Great Depression: A Job at Ford's | 0.0 | 0 | 1993-01-01 | 0.0 | 52 | 0.0 | Just before the advent of the Great Depression, Henry Ford controlled the most important company in the most important industry in the booming American economy. His offer of high wages in exchange for hard work attracted workers to Detroit, but it began to come apart when Ford hired a private police force to speed up production and spy on employees. After the depression hit in 1929, these workers faced a new, grim reality as unemployment skyrocketed. | Documentary | Blackside, Inc., Corporation for Public Broadcasting | United States of America | English | English | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Short Films | Micro-Budget
490710 | 695272 | Cannibal Island | 0.0 | 0 | 1956-02-01 | 0.0 | 75 | 0.0 | Though the release date says 1956, this film consists mostly of footage from a 1931 documentary called "Gow the Killer." It was the first sound film to deal with cannibalism, as it documented the social life and customs of primitive tribes that in fact did engage in cannibalism. | Documentary | nan | nan | nan | English | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Short Films | Micro-Budget